Constraint Relaxations for Discovering Unknown Sequential Patterns

Authors

  • Cláudia Antunes
  • Arlindo L. Oliveira
Abstract

The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the commonly accepted solution, the use of constraints, reduces the mining process to verifying which of the specified patterns are frequent, rather than discovering unknown and unexpected patterns. In this paper, we propose a new methodology for mining sequential patterns that keeps the focus on user expectations without compromising the discovery of unknown patterns. The methodology is based on constraint relaxations, which are used to filter the accepted patterns during the mining process. We propose a hierarchy of relaxations, applied to constraints expressed as context-free languages, that classifies the existing relaxations (legal, valid and naïve, previously proposed) and introduces several new classes. The new classes range from the approx and non-accepted relaxations to compositions of different types of relaxations, such as the approx-legal or the nonprefix-valid relaxations. Finally, we present a case study that shows the results of applying this methodology to the analysis of the curricular sequences of computer science students.
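To make the idea of filtering with a relaxation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual reading of the three previously proposed relaxations from earlier work on regular-expression constraints: naïve accepts any sequence built only from symbols the constraint uses, legal accepts any sequence that can be followed from some state of the automaton, and valid additionally requires ending in an accepting state. For brevity the toy constraint is the regular language `a b* c` encoded as a small automaton, whereas the abstract works with context-free languages; the names `TRANSITIONS`, `run`, `naive`, `legal`, `valid` and `accepted` are all hypothetical.

```python
# Hypothetical toy constraint: the regular language a b* c, encoded as a DFA.
# Assumption: a DFA stands in for the context-free constraints of the paper.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
START = 0
ACCEPTING = {2}
STATES = {0, 1, 2}
ALPHABET = {symbol for (_, symbol) in TRANSITIONS}


def run(state, seq):
    """Follow seq from state; return the final state, or None if stuck."""
    for symbol in seq:
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return None
    return state


def accepted(seq):
    """Full constraint: seq is a word of the language."""
    return run(START, seq) in ACCEPTING


def naive(seq):
    """Naive relaxation (assumed reading): only constraint symbols appear."""
    return all(symbol in ALPHABET for symbol in seq)


def legal(seq):
    """Legal relaxation (assumed reading): seq can be followed from some state."""
    return any(run(q, seq) is not None for q in STATES)


def valid(seq):
    """Valid relaxation (assumed reading): legal and ends in an accepting state."""
    return any(run(q, seq) in ACCEPTING for q in STATES)


if __name__ == "__main__":
    for candidate in ["abc", "bc", "ab", "ca", "ad"]:
        print(candidate, accepted(candidate), valid(candidate),
              legal(candidate), naive(candidate))
```

In this sketch each relaxation admits a superset of the patterns admitted by the one before it (accepted, then valid, then legal, then naïve), which is why a relaxed filter lets a miner keep patterns that the full constraint would have discarded.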

Similar resources

Mining Patterns Using Relaxations of User Defined Constraints

The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the commonly accepted solution, the use of constraints, reduces the mining process to a hypothesis-testing task. In this paper, we propose a new methodology to mine sequential patterns, keeping the focus on user expectations, without compro...

Constraint-based sequential pattern mining: a pattern growth algorithm incorporating compactness, length and monetary

Sequential pattern mining is advantageous for several applications. For example, it finds the sequential purchasing behavior of the majority of customers from a large number of customer transactions. However, existing research on discovering sequential patterns is based on the concept of frequency and presumes that customer purchasing behavior sequences do not fluctuate with ...

Discovering Active and Profitable Patterns with RFM (Recency, Frequency and Monetary) Sequential Pattern Mining: A Constraint Based Approach

Sequential pattern mining is an extension of association rule mining that discovers time-related behaviors in a sequence database. It extends association by adding time to the transactions. The problem of finding association rules is concerned with intra-transaction patterns, whereas that of sequential pattern mining is concerned with inter-transaction patterns. Generalized Sequential Pattern (GSP) mining a...

Mining Constraint-based Multidimensional Frequent Sequential Pattern in Web Logs

In this paper we introduce an efficient strategy for discovering ... Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in ...

WildSpan: Efficient Discovery of Functional Motifs Spanning Large Wildcard Regions from Protein Sequences

Motivation: Automatic extraction of motifs from biological sequences is an important problem in molecular biology. For proteins, it is desired to discover sequence motifs containing large irregular gaps as the contact residues associated with a functional site are not always from one region of the sequences. Discovering such patterns is a time-consuming task due to a large number of combination...

Publication date: 2004